Based on the tutorial by Francois Chollet @fchollet https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html and the workbook by Guillaume Dominici https://github.com/gggdominici/keras-workshop
This tutorial presents several ways to build an image classifier with Keras from just a few hundred or a few thousand pictures of each class you want to recognize.
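We will go over the following options: training a small convolutional network from scratch, adding data augmentation, using the bottleneck features of a pre-trained VGG16 network, and fine-tuning the top layers of that network.
This will lead us to cover the following Keras features: ImageDataGenerator and flow_from_directory for loading and augmenting images, fit_generator for training from a generator, and layer freezing for fine-tuning.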
Data can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
All you need is the train set
The recommended folder structure is:
data/
    train/
        dogs/ ### 1024 pictures
            dog001.jpg
            dog002.jpg
            ...
        cats/ ### 1024 pictures
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/ ### 416 pictures
            dog001.jpg
            dog002.jpg
            ...
        cats/ ### 416 pictures
            cat001.jpg
            cat002.jpg
            ...
Note: for this example we only use about 2x1000 training images (1024 per class) and 2x400 validation images (416 per class) out of the 2x12500 available.
The GitHub repo includes about 1500 images for this model. The original Kaggle dataset is much larger. The purpose of this demo is to show how you can build models with smaller datasets. You should be able to improve this model by using more data.
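One way to produce the folder structure above from the unpacked Kaggle train folder is sketched below. The helper name `build_subset` and its parameters are hypothetical, and the sketch assumes the Kaggle files are named `cat.0.jpg`, `dog.0.jpg`, and so on; adjust the pattern if your copy differs. The copied files keep their Kaggle names, which is fine because only the class folder matters to the generators used later.

```python
import os
import shutil


def build_subset(src_dir, dst_dir, n_train=1024, n_val=416):
    """Copy a small, balanced subset of the Kaggle train folder into dst_dir."""
    for animal in ("cat", "dog"):
        # Assumed Kaggle naming: cat.0.jpg, cat.1.jpg, ..., dog.0.jpg, ...
        names = ["{}.{}.jpg".format(animal, i) for i in range(n_train + n_val)]
        splits = (("train", names[:n_train]), ("validation", names[n_train:]))
        for split, split_names in splits:
            # Class folders are plural: data/train/cats, data/train/dogs, ...
            target = os.path.join(dst_dir, split, animal + "s")
            os.makedirs(target, exist_ok=True)
            for name in split_names:
                shutil.copyfile(os.path.join(src_dir, name),
                                os.path.join(target, name))


# Example call (paths are placeholders):
# build_subset("kaggle/train", "data")
```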
In [1]:
##This notebook is built around using tensorflow as the backend for keras
#!pip install pillow
!KERAS_BACKEND=tensorflow python -c "from keras import backend"
In [8]:
import os
import numpy as np
from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Conv2D, Convolution2D, MaxPooling2D, ZeroPadding2D
from keras import optimizers
In [9]:
# dimensions of our images.
img_width, img_height = 150, 150
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
In [10]:
# used to rescale the pixel values from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1./255)
# automagically retrieve images and their classes for train and validation sets
train_generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=16,
    class_mode='binary')
validation_generator = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode='binary')
In [5]:
model = Sequential()
model.add(Conv2D(32,(3,3), input_shape=(img_width, img_height,3)))
#model.add(Convolution2D(32, 3, 3, input_shape=(img_width, img_height,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32,(3,3)))
#model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64,(3,3)))
#model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
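To see what the stack above does to a 150x150 input, the shape arithmetic can be sketched in plain Python. This assumes the Keras defaults used in the cell (unpadded 3x3 convolutions with stride 1 and 2x2 max pooling that rounds down); the helper names are illustrative only.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Output size of non-overlapping max pooling (rounds down)."""
    return size // window


def feature_map_sizes(size=150, n_blocks=3):
    """Spatial size after each conv + pool block of the small network."""
    sizes = []
    for _ in range(n_blocks):
        size = pool(conv_valid(size))
        sizes.append(size)
    return sizes


sizes = feature_map_sizes()               # side length after each block
flattened = sizes[-1] * sizes[-1] * 64    # the last conv layer has 64 filters
dense_params = flattened * 64 + 64        # weights + biases of Dense(64)
print(sizes, flattened, dense_params)
```

Most of the parameters of this small network sit in the first dense layer, which is one reason dropout is applied right after it.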
In [6]:
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
In [7]:
nb_epoch = 30
nb_train_samples = 2048
nb_validation_samples = 832
In [8]:
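# Keras 2 measures epoch length in batches (steps), not in samples
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // train_generator.batch_size,
    epochs=nb_epoch,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // validation_generator.batch_size)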
Out[8]:
In [9]:
model.save_weights('models/basic_cnn_20_epochs.h5')
In [10]:
#model.load_weights('models_trained/basic_cnn_20_epochs.h5')
If your model runs successfully for one epoch, go back and run it for 30 epochs by changing nb_epoch above. I was able to reach a val_acc of 0.71 at 30 epochs. A copy of a pretrained network is available in the pretrained folder.
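Note that the Keras 2 generator methods express the length of an epoch in batches (`steps_per_epoch`, `validation_steps`) rather than in samples. A minimal sketch of the conversion from the sample counts used above, with a hypothetical helper name:

```python
import math


def steps_for(n_samples, batch_size):
    """Number of batches needed to see n_samples once (the last batch may be partial)."""
    return math.ceil(n_samples / batch_size)


train_steps = steps_for(2048, 16)   # training generator uses batch_size=16
val_steps = steps_for(832, 32)      # validation generator uses batch_size=32
print(train_steps, val_steps)
```

With the sizes used here both divisions are exact, so integer division gives the same result.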
Computing loss and accuracy :
In [11]:
# the second argument is a number of batches (steps) in Keras 2
model.evaluate_generator(validation_generator, nb_validation_samples // validation_generator.batch_size)
Out[11]:
Evolution of accuracy on training (blue) and validation (green) sets for 1 to 32 epochs :
After ~10 epochs the neural network reaches ~70% accuracy. We can observe overfitting: no further progress is made on the validation set in the following epochs.
By applying random transformations to our training set, we artificially enlarge our dataset with new, unseen images.
This will hopefully reduce overfitting and allow our network to generalize better.
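To make the idea concrete, here is a minimal NumPy sketch of two of the transformations involved, rescaling and a random horizontal flip. It is only an illustration of the principle, not what ImageDataGenerator does internally, and the function name is made up for this example.

```python
import numpy as np


def random_flip_and_rescale(image, rng):
    """Rescale a uint8 HxWxC image to [0, 1] and flip it left-right half of the time."""
    out = image.astype(np.float32) / 255.0
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # reverse the width axis
    return out


rng = np.random.default_rng(0)
# A random array stands in for a real 150x150 RGB picture
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)
augmented = random_flip_and_rescale(image, rng)
print(augmented.shape, augmented.min(), augmented.max())
```

Because the transformation is drawn again every time an image is requested, the network rarely sees exactly the same picture twice.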
Example of data augmentation applied to a picture:
In [11]:
train_datagen_augmented = ImageDataGenerator(
    rescale=1./255,        # normalize pixel values to [0,1]
    shear_range=0.2,       # randomly applies shearing transformation
    zoom_range=0.2,        # randomly zooms in or out
    horizontal_flip=True)  # randomly flip the images
# same code as before
train_generator_augmented = train_datagen_augmented.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode='binary')
In [12]:
nb_epoch = 30
In [ ]:
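# Keras 2 measures epoch length in batches (steps), not in samples
model.fit_generator(
    train_generator_augmented,
    steps_per_epoch=nb_train_samples // train_generator_augmented.batch_size,
    epochs=nb_epoch,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // validation_generator.batch_size)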
In [ ]:
model.save_weights('models/augmented_30_epochs.h5')
In [15]:
#model.load_weights('models_trained/augmented_30_epochs.h5')
Computing loss and accuracy :
In [16]:
# the second argument is a number of batches (steps) in Keras 2
model.evaluate_generator(validation_generator, nb_validation_samples // validation_generator.batch_size)
Out[16]:
Evolution of accuracy on training (blue) and validation (green) sets for 1 to 100 epochs :
Thanks to data augmentation, the accuracy on the validation set improved to ~80%.
Training a convolutional neural network can be very time-consuming and can require a lot of data.
We can go beyond the previous models in terms of performance and efficiency by using a general-purpose, pre-trained image classifier. This example uses VGG16, a model trained on the ImageNet dataset - which contains millions of images classified in 1000 categories.
On top of it, we add a small multi-layer perceptron and we train it on our dataset.
In [17]:
# Conv2D(filters, (rows, cols)) is the Keras 2 form of Convolution2D(filters, rows, cols)
model_vgg = Sequential()
model_vgg.add(ZeroPadding2D((1, 1), input_shape=(img_width, img_height,3)))
model_vgg.add(Conv2D(64, (3, 3), activation='relu', name='conv1_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(64, (3, 3), activation='relu', name='conv1_2'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(128, (3, 3), activation='relu', name='conv2_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(128, (3, 3), activation='relu', name='conv2_2'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', name='conv3_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', name='conv3_2'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', name='conv3_3'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv4_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv4_2'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv4_3'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv5_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv5_2'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv5_3'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
Note: the VGG16 weights file (~500MB) is not included in this repository. You can download it from:
https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
In [18]:
import h5py
f = h5py.File('models/vgg/vgg16_weights.h5', 'r')
for k in range(f.attrs['nb_layers']):
    if k >= len(model_vgg.layers) - 1:
        # we don't look at the last two layers in the savefile (fully-connected and activation)
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    layer = model_vgg.layers[k]
    # in Keras 2, Convolution2D is an alias of Conv2D, so the class name is 'Conv2D'
    if layer.__class__.__name__ in ['Conv2D', 'Convolution1D', 'Convolution2D', 'Convolution3D', 'AtrousConvolution2D']:
        # reorder the kernel axes; assuming (out, in, rows, cols) storage this gives (rows, cols, in, out)
        weights[0] = np.transpose(weights[0], (2, 3, 1, 0))
    layer.set_weights(weights)
f.close()
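The `np.transpose(weights[0], (2, 3, 1, 0))` call in the loop reorders the axes of each convolution kernel. The sketch below shows its effect on a dummy array, assuming the weights file stores kernels as (out_channels, in_channels, rows, cols); that storage order is an assumption about the file, not something this notebook verifies.

```python
import numpy as np

# Dummy kernel in the assumed storage order: (out_channels, in_channels, rows, cols)
stored_kernel = np.arange(128 * 64 * 3 * 3, dtype=np.float32).reshape(128, 64, 3, 3)

# Same permutation as in the loading loop: result is (rows, cols, in_channels, out_channels)
tf_kernel = np.transpose(stored_kernel, (2, 3, 1, 0))

print(stored_kernel.shape, tf_kernel.shape)
```

Only the axis order changes; every individual weight keeps its value.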
In [19]:
train_generator_bottleneck = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode=None,
    shuffle=False)
validation_generator_bottleneck = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode=None,
    shuffle=False)
This is a long process, so we save the output of the VGG16 once and for all.
In [20]:
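# the second argument is a number of batches (steps) in Keras 2
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck, nb_train_samples // train_generator_bottleneck.batch_size)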
np.save(open('models/bottleneck_features_train.npy', 'wb'), bottleneck_features_train)
In [21]:
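# the second argument is a number of batches (steps) in Keras 2
bottleneck_features_validation = model_vgg.predict_generator(validation_generator_bottleneck, nb_validation_samples // validation_generator_bottleneck.batch_size)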
np.save(open('models/bottleneck_features_validation.npy', 'wb'), bottleneck_features_validation)
Now we can load it...
In [22]:
train_data = np.load(open('models/bottleneck_features_train.npy', 'rb'))
train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))
validation_data = np.load(open('models/bottleneck_features_validation.npy', 'rb'))
validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))
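The label arrays above work because the bottleneck generators were created with shuffle=False and flow_from_directory assigns class indices in the sorted order of the folder names, so all cats (0) come before all dogs (1). A small sketch of the same construction, with a hypothetical helper name:

```python
import numpy as np


def labels_for_sorted_classes(counts_by_class):
    """Build labels for unshuffled batches: class indices follow sorted folder names."""
    labels = []
    for index, name in enumerate(sorted(counts_by_class)):
        labels.extend([index] * counts_by_class[name])
    return np.array(labels)


# 'cats' sorts before 'dogs', so cats get label 0 and dogs label 1
train_labels_check = labels_for_sorted_classes({"dogs": 1024, "cats": 1024})
print(train_labels_check[:3], train_labels_check[-3:])
```

If the classes were unbalanced, the half-and-half construction used in the cell would no longer match, while a per-class count like this one still would.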
And define and train the custom fully connected neural network :
In [23]:
model_top = Sequential()
model_top.add(Flatten(input_shape=train_data.shape[1:]))
model_top.add(Dense(256, activation='relu'))
model_top.add(Dropout(0.5))
model_top.add(Dense(1, activation='sigmoid'))
model_top.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
In [24]:
nb_epoch=40
model_top.fit(train_data, train_labels,
              epochs=nb_epoch, batch_size=32,
              validation_data=(validation_data, validation_labels))
Out[24]:
The training process of this small neural network is very fast : ~2s per epoch
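Part of the reason is that this classifier only sees the compact output of the convolutional base. A sketch of the shape arithmetic, assuming the VGG16 base defined above (padded 3x3 convolutions that keep the spatial size, five 2x2 poolings, 512 filters in the last block):

```python
def vgg_output_side(size, n_pools=5):
    """Spatial size after n_pools 2x2 poolings; the padded convolutions keep the size."""
    for _ in range(n_pools):
        size //= 2
    return size


side = vgg_output_side(150)             # side of the final feature map
n_features = side * side * 512          # length of the flattened bottleneck vector
dense_params = n_features * 256 + 256   # weights + biases of Dense(256)
print(side, n_features, dense_params)
```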
In [25]:
model_top.save_weights('models/bottleneck_40_epochs.h5')
In [26]:
#model_top.load_weights('models/with-bottleneck/1000-samples--100-epochs.h5')
#model_top.load_weights('/notebook/Data1/Code/keras-workshop/models/with-bottleneck/1000-samples--100-epochs.h5')
Loss and accuracy :
In [27]:
model_top.evaluate(validation_data, validation_labels)
Out[27]:
Evolution of accuracy on training (blue) and validation (green) sets for 1 to 32 epochs :
We reached 90% accuracy on the validation set after ~1 min of training (~20 epochs), using only 8% of the samples originally available in the Kaggle competition!
In [28]:
##Fine-tuning the top layers of a pre-trained network
Start by instantiating the VGG base and loading its weights.
In [30]:
# Conv2D(filters, (rows, cols)) is the Keras 2 form of Convolution2D(filters, rows, cols)
model_vgg = Sequential()
model_vgg.add(ZeroPadding2D((1, 1), input_shape=(img_width, img_height,3)))
model_vgg.add(Conv2D(64, (3, 3), activation='relu', name='conv1_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(64, (3, 3), activation='relu', name='conv1_2'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(128, (3, 3), activation='relu', name='conv2_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(128, (3, 3), activation='relu', name='conv2_2'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', name='conv3_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', name='conv3_2'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', name='conv3_3'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv4_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv4_2'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv4_3'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv5_1'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv5_2'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Conv2D(512, (3, 3), activation='relu', name='conv5_3'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))
In [31]:
import h5py
f = h5py.File('models/vgg/vgg16_weights.h5', 'r')
for k in range(f.attrs['nb_layers']):
    if k >= len(model_vgg.layers) - 1:
        # we don't look at the last two layers in the savefile (fully-connected and activation)
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    layer = model_vgg.layers[k]
    # in Keras 2, Convolution2D is an alias of Conv2D, so the class name is 'Conv2D'
    if layer.__class__.__name__ in ['Conv2D', 'Convolution1D', 'Convolution2D', 'Convolution3D', 'AtrousConvolution2D']:
        # reorder the kernel axes; assuming (out, in, rows, cols) storage this gives (rows, cols, in, out)
        weights[0] = np.transpose(weights[0], (2, 3, 1, 0))
    layer.set_weights(weights)
f.close()
Build a classifier model to put on top of the convolutional model. For fine-tuning, we start with a fully trained classifier, so we reuse the weights from the earlier model and then add this model on top of the convolutional base.
In [32]:
top_model = Sequential()
top_model.add(Flatten(input_shape=model_vgg.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
top_model.load_weights('models/bottleneck_40_epochs.h5')
model_vgg.add(top_model)
For fine-tuning, we only want to train a few layers. The loop below sets the first 25 layers (everything up to the last convolutional block) to non-trainable.
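To check where index 25 falls, the layer order of the hand-built VGG16 base can be reproduced in plain Python. The generic names `zero_padding` and `max_pooling` are placeholders chosen for this sketch; only the `convN_M` names come from the model definition.

```python
def vgg_layer_names(convs_per_block=(2, 2, 3, 3, 3)):
    """List the layers of the hand-built VGG16 base in the order they are added."""
    names = []
    for block, n_convs in enumerate(convs_per_block, start=1):
        for conv in range(1, n_convs + 1):
            names.append("zero_padding")  # each convolution is preceded by padding
            names.append("conv{}_{}".format(block, conv))
        names.append("max_pooling")       # each block ends with a pooling layer
    return names


names = vgg_layer_names()
frozen, trainable = names[:25], names[25:]
print(len(names), frozen[-1], trainable[0])
```

So the slice freezes blocks 1 to 4 plus one weightless padding layer, and training starts at conv5_1; the classifier added on top becomes one more trainable entry at the end of model_vgg.layers.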
In [33]:
for layer in model_vgg.layers[:25]:
layer.trainable = False
In [34]:
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model_vgg.compile(loss='binary_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
                  metrics=['accuracy'])
In [35]:
# prepare data augmentation configuration . . . do we need this?
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=32,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=32,
    class_mode='binary')
In [36]:
# fine-tune the model
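# Keras 2 measures epoch length in batches (steps), not in samples
model_vgg.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // train_generator.batch_size,
    epochs=nb_epoch,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // validation_generator.batch_size)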
Out[36]:
In [37]:
model_vgg.save_weights('models/finetuning_20epochs_vgg.h5')
In [38]:
model_vgg.load_weights('models/finetuning_20epochs_vgg.h5')
Computing loss and accuracy :
In [39]:
# the second argument is a number of batches (steps) in Keras 2
model_vgg.evaluate_generator(validation_generator, nb_validation_samples // validation_generator.batch_size)
Out[39]:
In [40]:
# the second argument is a number of batches (steps) in Keras 2
model.evaluate_generator(validation_generator, nb_validation_samples // validation_generator.batch_size)
Out[40]:
In [41]:
model_top.evaluate(validation_data, validation_labels)
Out[41]: